Retrieving Translation Candidates from Patent Corpora
Related resources
Finding Translation Candidates from Patent Corpus
This paper describes a method for retrieving technical terms and finding their translation candidates from patent corpora. The method improves the reliability of bilingual seed words that measure similarity between a target word and its translation candidates. We conducted an experiment with PAJ (Patent Abstracts of Japan), which is a collection of bilingual patent abstracts written in Japanese...
Ranking Translation Candidates Acquired from Comparable Corpora
Domain-specific bilingual lexicons extracted from domain-specific comparable corpora provide, for each term, a ranked list of translation candidates. This study proposes to re-rank these translation candidates. We suggest that a term and its translation appear in comparable sentences that can be extracted from domain-specific comparable corpora. For a source term and a list of translation candidate...
Retrieving Lexical Semantics from Multilingual Corpora
This paper presents a technique to build a lexical resource used for annotation of parallel corpora where the tags can be seen as multilingual ‘synsets’. The approach can be extended to add relationships between these synsets that are akin to WordNet relationships of synonymy and hypernymy. The paper also discusses how the success of this approach can be measured. The reported results are for E...
Retrieving Knowledge from Technical, Manually Indexed Corpora
In technical fields, experts have manually indexed a huge collection of texts. For this purpose, they used thesauri, which structure sets of available keywords. Approaches in automatic indexing have made extensive use of thesauri. However, our belief is that automated systems do not wholly take into account the experts' knowledge. We thus present a method to extract that kind of knowledge from ma...
Translation Using JAPIO Patent Corpora: JAPIO at WAT2016
Japan Patent Information Organization (JAPIO) participates in the scientific paper subtask (ASPEC-EJ/CJ) and the patent subtask (JPC-EJ/CJ/KJ) with phrase-based SMT systems trained on its own patent corpora. Using larger corpora than those prepared by the workshop organizer, we achieved higher BLEU scores than most participants in EJ and CJ translations of the patent subtask, but in crowdsourci...
Journal
Journal title: Journal of Natural Language Processing
Year: 2007
ISSN: 1340-7619,2185-8314
DOI: 10.5715/jnlp.14.4_23